Add langchain4j-jllama module: in-process LangChain4j adapters #284
Merged
bernardladenthin merged 1 commit into bernardladenthin:main on Jul 1, 2026
Introduce a separate Maven artifact that adapts a java-llama.cpp LlamaModel to LangChain4j's model interfaces over JNI, with no HTTP hop:

- JllamaChatModel -> ChatModel
- JllamaStreamingChatModel -> StreamingChatModel (token streaming)
- JllamaEmbeddingModel -> EmbeddingModel
- JllamaScoringModel -> ScoringModel (rerank; scores aligned by input index)

The adapters borrow a caller-owned LlamaModel and never close it. The module depends on langchain4j-core 1.17.1, but the core net.ladenthin:llama binding gains no langchain4j dependency, so plain users never pull it transitively. It is kept as a sibling module (not part of the root reactor) so the native build and release pipeline stay untouched, and it targets Java 17 to match the langchain4j 1.x baseline.

The pure message/parameter/response transforms are unit-tested model-free; an end-to-end chat and streaming test self-skips when no GGUF is provided. The module README documents usage and the currently unmapped surfaces (tool calling, multimodal user input).
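The "scores aligned by input index" contract for the scoring adapter can be illustrated with a small, model-free sketch. Everything here is hypothetical: `RerankHit` and `alignByInputIndex` are illustrative names, not the module's or LangChain4j's API, and the sketch assumes the reranker returns one (index, score) pair per input document in arbitrary order.

```java
import java.util.Arrays;
import java.util.List;

final class RerankAlignment {

    /**
     * Returns scores ordered like the input documents: result[i] is the score of
     * document i, no matter in which order the hits arrive.
     */
    static double[] alignByInputIndex(List<RerankHit> hits, int documentCount) {
        double[] scores = new double[documentCount];
        boolean[] seen = new boolean[documentCount];
        for (RerankHit hit : hits) {
            int i = hit.index();
            if (i < 0 || i >= documentCount) {
                throw new IllegalArgumentException("hit index out of range: " + i);
            }
            if (seen[i]) {
                throw new IllegalArgumentException("duplicate hit for document " + i);
            }
            scores[i] = hit.score();
            seen[i] = true;
        }
        // Every input document must have received exactly one score.
        for (int i = 0; i < documentCount; i++) {
            if (!seen[i]) {
                throw new IllegalArgumentException("no score for document " + i);
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        // Hits sorted by relevance, as a reranker might return them (assumption).
        List<RerankHit> hits = List.of(
                new RerankHit(2, 0.91), new RerankHit(0, 0.40), new RerankHit(1, 0.07));
        System.out.println(Arrays.toString(alignByInputIndex(hits, 3)));
        // prints [0.4, 0.07, 0.91]
    }
}

/** Hypothetical stand-in for one rerank hit: the input document's index and its score. */
record RerankHit(int index, double score) {}
```

Rejecting missing or duplicate indices, rather than defaulting to zero, is a design choice of this sketch; it keeps a misaligned result from silently passing as a low score.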
Author
This PR is just for me to integrate Jllama as a native model within LangChain4j. This PR might mean a separate project.
Owner
Hey @vaiju1981, give me some minutes to integrate it well, then maybe do additional work. Thanks and bests!
bernardladenthin merged commit b5ee309 into bernardladenthin:main
40 of 44 checks passed
vaiju1981 pushed a commit to vaiju1981/java-llama.cpp that referenced this pull request on Jul 1, 2026
…Central publish

Cleans up the integration of the merged langchain4j adapters (PR bernardladenthin#284) so the module is built, gated, version-locked and releasable, without touching the native build/release pipeline.

- Rename artifact + directory langchain4j-jllama -> llama-langchain4j so it groups with the core net.ladenthin:llama family (Java package unchanged).
- Pin the core dependency to ${project.version} (drops the drift-prone jllama.version property); a CI guard fails the build if the module version ever diverges from the core version (standalone module can't inherit it from a reactor).
- Add per-artifact release plumbing (sources + javadoc + gpg + Central Publishing) mirroring the core release profile, so the module can deploy to Maven Central at the same version.
- publish.yml: new test-java-llama-langchain4j job (install core Java jar, version-lockstep guard, mvn verify; this builds the javadoc jar so a release-time javadoc break is caught in PR CI). publish-snapshot/publish-release now depend on it and deploy the module alongside the core.
- REUSE.toml + README updated to the new name; CLAUDE.md documents the module, why it is a separate artifact (not a classifier), and the CI/publish wiring.

Verified locally: core Java jar installs, module builds green (7 mapping tests pass, 2 model-backed integration tests self-skip), and the main/sources/javadoc jars all build under doclint=all.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Rt1paYztGJ2AKUuBuAGDXE
Summary
Adds a new sibling Maven module `langchain4j-jllama` that adapts a java-llama.cpp `LlamaModel` to LangChain4j's model interfaces in-process over JNI, with no HTTP hop and no separate `llama-server`:

| Adapter | LangChain4j interface | Backed by |
|---|---|---|
| `JllamaChatModel` | `ChatModel` | `LlamaModel.chat(...)` |
| `JllamaStreamingChatModel` | `StreamingChatModel` | `LlamaModel.generateChat(...)` (token streaming) |
| `JllamaEmbeddingModel` | `EmbeddingModel` | `LlamaModel.embed(...)` |
| `JllamaScoringModel` | `ScoringModel` (re-rank) | `LlamaModel.handleRerank(...)` |

Design decisions
- The module depends on `langchain4j-core:1.17.1`, but the core `net.ladenthin:llama` binding gains no langchain4j dependency: plain java-llama.cpp users never pull langchain4j (or its Java 17 floor) transitively.
- It is kept as a sibling module (not part of the root reactor) so the native build and release pipeline stay untouched; the module builds independently against the published core jar. Targets Java 17 (langchain4j 1.x baseline; the core stays Java 8).
- The adapters borrow a caller-owned `LlamaModel` and never load or close it. One model can back several adapters.
- With `OpenAiCompatServer`, langchain4j's `langchain4j-open-ai` client already works over HTTP with zero code. This module is for the in-process path (desktop / Android / embedded, no socket).
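The ownership rule above (adapters borrow the model, the caller owns its lifecycle, one model backs several adapters) might look roughly like the following. This is a sketch under stated assumptions: `FakeModel`, `ChatAdapter` and `EmbeddingAdapter` are hypothetical stand-ins so the example runs without a GGUF or native library; they are not the real `LlamaModel` or `Jllama*` classes and do not reflect their signatures.

```java
import java.util.Objects;

final class BorrowedModelDemo {
    public static void main(String[] args) {
        // The caller creates the model and is the only party that closes it.
        try (FakeModel model = new FakeModel()) {
            ChatAdapter chat = new ChatAdapter(model);
            EmbeddingAdapter embeddings = new EmbeddingAdapter(model);
            System.out.println(chat.chat("hi"));                // prints echo: hi
            System.out.println(embeddings.embed("hi").length);  // prints 1
        } // the model is closed here, by its owner, not by an adapter
    }
}

/** Hypothetical stand-in for a native model handle; NOT the real LlamaModel API. */
final class FakeModel implements AutoCloseable {
    private boolean closed;

    String complete(String prompt) {
        ensureOpen();
        return "echo: " + prompt;
    }

    float[] embed(String text) {
        ensureOpen();
        return new float[] {text.length()}; // toy one-dimensional "embedding"
    }

    boolean isClosed() {
        return closed;
    }

    private void ensureOpen() {
        if (closed) {
            throw new IllegalStateException("model is closed");
        }
    }

    @Override
    public void close() {
        closed = true;
    }
}

/** Borrows the model: keeps a reference, exposes no close(), never closes it. */
final class ChatAdapter {
    private final FakeModel model;

    ChatAdapter(FakeModel model) {
        this.model = Objects.requireNonNull(model, "model");
    }

    String chat(String userMessage) {
        return model.complete(userMessage);
    }
}

/** A second adapter sharing the same borrowed model instance. */
final class EmbeddingAdapter {
    private final FakeModel model;

    EmbeddingAdapter(FakeModel model) {
        this.model = Objects.requireNonNull(model, "model");
    }

    float[] embed(String text) {
        return model.embed(text);
    }
}
```

Because the adapters are deliberately not `AutoCloseable`, discarding one cannot invalidate another adapter that shares the same model; the trade-off is that the caller must keep the model open for as long as any adapter is in use.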
Testing
- Model-free unit tests (`LangChain4jMappingTest`, 7 tests): role mapping, multimodal→text flattening, sampling-parameter pass-through, finish-reason mapping, and rerank index alignment.
- `JllamaChatModelIntegrationTest` runs a real chat + streaming round-trip and self-skips unless `-Dnet.ladenthin.llama.model.path=...` points at a GGUF (mirrors the existing model-gated tests).
- `mvn test`: 7 run, 2 skipped, green.

Adapter contracts were verified against langchain4j 1.17.1 source: `doChat` is the correct override point, the chat response reports the model's real finish reason (`stop`/`length`/`tool_calls`) and token usage, and streaming reports failures via `onError` (the framework does not wrap `doChat`).

Not yet mapped (documented in the module README)
- Tool calling (`ToolSpecification` ↔ jllama `ToolDefinition`): the main follow-up.
- `response_format` (JSON mode); multimodal user input is flattened to text.

Notes for review
- The tests above were run locally. Happy to add a small job (`mvn -DskipTests install` for the core → `cd langchain4j-jllama && mvn test`) if you'd like it gated in CI.
- If you'd prefer this as a separate project rather than in-tree, that's a clean move; let me know your preference.
REUSE.toml.
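For reference, the self-skip gate mentioned under Testing could be sketched as below. Only the property name `net.ladenthin.llama.model.path` comes from this PR; the `ModelGate` class, its method, and the check that the path is an existing regular file are assumptions made for illustration, not the actual test code.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

final class ModelGate {

    /** System property named in this PR; set it with -Dnet.ladenthin.llama.model.path=... */
    static final String MODEL_PATH_PROPERTY = "net.ladenthin.llama.model.path";

    /**
     * Returns the configured model path, or empty when the property is unset,
     * blank, or does not point at an existing regular file (assumed policy).
     */
    static Optional<Path> configuredModel() {
        String value = System.getProperty(MODEL_PATH_PROPERTY);
        if (value == null || value.isBlank()) {
            return Optional.empty();
        }
        Path path = Path.of(value);
        return Files.isRegularFile(path) ? Optional.of(path) : Optional.empty();
    }

    public static void main(String[] args) {
        // A model-backed test would skip itself when this is empty.
        System.out.println(configuredModel()
                .map(p -> "would run against " + p)
                .orElse("would skip: no model configured"));
    }
}
```

In a JUnit 5 test, for example, the gate could feed `Assumptions.assumeTrue(ModelGate.configuredModel().isPresent())`, so an unconfigured run is reported as skipped rather than failed; whether the module's tests use that mechanism is not stated here.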